Change-Point Estimation in High Dimensional Regression Models

Authors

  • Bingwen Zhang
  • Jun Geng
  • Lifeng Lai
Abstract

We consider high dimensional nonhomogeneous linear regression models with p/n ↛ 0 or p >> n, where p is the number of features and n is the number of observations. In the model considered, the underlying true regression coefficients undergo multiple changes. Our goal is to estimate the number and locations of these change-points and to estimate the sparse coefficients in each of the intervals between change-points. This paper develops an approach, based on the sparse group Lasso (SGL), to solve the multiple change-point estimation problem in high dimensional linear regression models. We analyze the performance of our approach and prove several consistency results. In particular, under certain assumptions and using a properly chosen regularization parameter, we show that the estimation errors of the linear coefficients and change-point locations can be expressed as functions of n, p and s, where s is the sparsity level of each coefficient. From these functions, we can understand how the estimation errors scale with the system parameters and identify conditions on the system parameters under which the estimation errors diminish. Furthermore, we show that the estimation of change-points always overfits, which eliminates the risk of missing true change-points, and that isolated estimated change-points between true change-points do not occur, which implies that the estimated change-points are clustered around the true change-points. We further extend our studies to general linear models (GLM) and prove similar results. Numerical simulations are provided to illustrate the effectiveness of our approach.

Similar resources

Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...

Estimating the Time of a Step Change in Gamma Regression Profiles Using MLE Approach

Sometimes the quality of a process or product is described by a functional relationship between a response variable and one or more explanatory variables referred to as profile. In most researches in this area the response variable is assumed to be normally distributed; however, occasionally in certain applications, the normality assumption is violated. In these cases the Generalized Linear Mod...

Change Point Estimation of the Stationary State in Auto Regressive Moving Average Models, Using Maximum Likelihood Estimation and Singular Value Decomposition-based Filtering

In this paper, for the first time, the subject of change point estimation has been utilized in the stationary state of auto regressive moving average (ARMA) (1, 1). In the monitoring phase, in case the features of the question pursue a time series, i.e., ARMA(1,1), on the basis of the maximum likelihood technique, an approach will be developed for the estimation of the stationary state’s change...

Robust Estimation in Linear Regression with Multicollinearity and Sparse Models

One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...

Methods for regression analysis in high-dimensional data

By evolving science, knowledge and technology, new and precise methods for measuring, collecting and recording information have been innovated, which have resulted in the appearance and development of high-dimensional data. The high-dimensional data set, i.e., a data set in which the number of explanatory variables is much larger than the number of observations, cannot be easily analyzed by ...

Publication date: 2016